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Abstract 

In this paper a nnmerical procednre to simulate low diffusivity scalar tur¬ 
bulence is presented. The method consists of using a grid for the advected 
scalar with a higher spatial resolutions than that of the momentum. The 
latter usually requires a less rehned mesh and integrating both helds on a 
single grid tailored to the most demanding variable, produces an unneces¬ 
sary computational overhead. A multiple resolution approach is used also 
in the time integration in order to maintain the stability of the scalars on 
the hner grid. The method is the more advantageous the less diffusive the 
scalar is with respect to momentum, therefore it is particularly well suited for 
large Prandtl or Schmidt number flows. However, even in the case of equal 
diffusivities the present procedure gives CPU time and memory occupation 
savings. The reason is that the absence of the pressure term in the scalar 
equation leads to much steeper gradients in the scalar held as compared to 
the velocity held. 

Keywords: Multiple resolution. Direct Numerical Simulation, Scalar 
turbulence. 


1. Introduction 

Countless phenomena in Nature and technology involve one or more scalar 
helds that are advected and dihused by a turbulent how. The dilution of 
pollution in the atmosphere [T], the transport of nutrients in oceans j2], the 
cooling or heating of devices [3] and the buoyancy-driven currents generated 
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by natural- HE] and double-diffusive convection are just few examples 
among many. 

Numerical simulations have proven to be very helpful in unravelling the 
complex physics behind these phenomena |8] even if the calculations have 
shown to be more demanding than expected, taking up to millions of CPU 
hours in recent studies [S]. 

The common understanding of the problem is that in three-dimensional 
turbulent flows, there is a cascade from the largest towards the small spatial 
scales up to a lower limit that is determined by the diffusivity. As each held 
has its own diffusivity, these scales can have different magnitudes. In direct 
numerical simulation (DNS) the mesh size must be smaller than the smallest 
among them: This requirement quickly renders DNS infeasible. Denoting as 
rjK the smallest (Kolmogorov) scale of the momentum held, we can calculate 
the analogous quantity for a scalar held S as tjb = also called the 

Batchelor scale, with Sc = v/ks the Schmidt number dehned as the ratio 
of the kinematic viscosity v and the scalar dihusivity K 5 , respectively. In 
some cases, like sugar in water, the Schmidt number exceeds 10^ resulting 
in a Batchelor scale of rjB — 777 ^-/ 30 . With equal grid resolutions for the 
scalar and the momentum helds, this entails that the momentum held is 
overresolved by a factor of approximately 30 in each direction. The problem 
is exacerbated by the fact that a scalar is described by only a single quantity, 
while momentum is a vector held satisfying the incompressibility condition 
or other related constraints. This implies that the solution of the momentum 
alone generally takes an order of 90% of the total CPU time of a simulation 
and therefore resolving it on an unnecessary hne mesh is not desirable. 

The above scenario essentially derived from dimensional analysis, how¬ 
ever, does not give the complete picture since it does not account for the 
structure of the equations. In fact, the naive comparison between the Kol¬ 
mogorov and Batchelor scales suggests that for Sc ~ 1, ~ rjB although 

in practice the resolution requirements for the momentum and the scalar 
helds are not the same. Visual evidence of the latter statement can be 
obtained from the instantaneous snapshots of hgure showing horizontal 
cross-sections of temperature and vertical velocity in a thermally driven tur¬ 
bulent how. In this how, the huid hotter than the average temperature (0.5 
in nondimensional units) generates upward buoyancy and therefore positive 
vertical velocity (and vice versa). Although the two helds are very well cor¬ 
related on the large scales, the sharp fronts of the scalar held do not have 
an analogous counterpart in the momentum distribution and this results in a 
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different resolution requirement for scalar and momentum. We will see that 
the main reason for this difference is the presence of the pressure term in 
the momentum equation that makes the dynamics non-local and tends to 
smooth the intense steep fronts. A problem related to the non-local equa¬ 
tions for momentum is their high computational cost and the detrimental 
implications on the parallel performance. Therefore, the possibility of us¬ 
ing different meshes for momentum and scalars opens the door to very large 
gains in performance, not only by reducing the amount of operations, but 
also the communication of data among processors and the total memory us¬ 
age. This consideration motivates the present paper that describes a strategy 
for efficiently simulating scalar driven turbulent flows with different spatial 
resolutions for momentum and scalar helds. 



Figure 1: A horizontal plane halfway between the plates for a Rayleigh-Benard simulation 
in a Cartesian geometry at Ra = 10^*^ and Prandtl number Pr = 1. a) vertical velocity, 
red indicating rising fluid while blue indicates falling fluid, b) temperature, red indicating 
hot fluid and blue indicating cold fluid. Even though the Prandtl number is one, much 
sharper gradients can be seen in the right panel. 

In this study, the multiple resolution strategy will be mainly applied to 
Rayleigh-Benard (RB) convection, the flow of a fluid vertically conhned by a 
top cold and a bottom hot plate. RB is a particularly suitable example for the 
present application since the flow is driven by the temperature (scalar) held 
whose diffusivity can be changed by the Prandtl number Pr. In addition, 
analytical exact relations are available for this problem that can be used to 
check the correctness of the numerical results. It is worth mentioning that in 
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RB convection the forcing conies from the heated and cooled surfaces where 
viscous and thermal boundary layers develop. Since they become thinner as 
the forcing strengthens, the resolution requirements in these boundary layers 
become more stringent than in the bulk. An extensive analysis of the problem 
can be found in Ref. HDl where all the details, estimates and guidelines for 
numerical simulations are given. Here, it suffices to mention that a non- 
uniform mesh is required in the wall normal direction such to cluster the 
nodes within the boundary layers. Nevertheless, even if the grid spacing at 
the wall is much hner than that in the bulk, the volume of fluid within these 
layers is at most a few percent of the total and the nodes allocated there are 
only a small fraction of the whole mesh. 

We will additionally show another numerical example. Namely, double 
diffusive convection (DDC), in which the flow is driven by two scalars with 
very different diffusivities and with opposite, stabilizing and destabilizing, 
effects on the flow. In this case the multiple resolution strategy is even more 
advantageous and it allows for the simulation of flow regimes that otherwise 
would be out of reach. 

The organization of the paper is the following. In the next section we 
describe the governing equations and the numerical method. Section [^quan- 
tihes the differences of momentum and scalar gradients and presents some 
analytical exact relations for RB. The section closes with the results of refer¬ 
ence simulations obtained on a standard single grid. In Section |^the multiple 
resolution strategy and numerical details are explained. Finally, Section 
discusses the results and the computational performance of the method for 
RB flow and double diffusive convection. Closing remarks are reported in 
section E 

2. Governing equations and numerical method 

The incompressible Navier-Stokes equations with the Boussinesq approx¬ 
imation for thermal convection, in nondimensional form, read: 


V ■ u = 0, 


( 1 ) 
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( 3 ) 


RaPr 

where u is the velocity, p the pressure, 6 the temperature (rescaled such that 
it is one at the hot plate and zero at the cold plate) and the unitary 
vector in the anti-parallel direction to gravity, which is also the plate-normal 
direction. The Rayleigh number Ra is the non-dimensional temperature 
difference defined as Ra = gfixiTh — T^Li^/ ktI', where v is the kinematic 
viscosity, (3t the isobaric thermal expansion coefficient and kt the thermal 
diffusivity of the fluid, g is the gravity, L the distance and and the 
temperature at the hot and cold plates, respectively. The Prandtl number 
Fr, which is the temperature analogue of the Schmidt number, is defined as 
Pr = v/kt- 

The integration of the above system is performed by a fractional timestep- 
ping m with the modihcations proposed in Ref. [12]. In short, a provisional 
velocity Ui is computed from the previous held u” using the old pressure p^ 


Uj - uf 

At 


-dip'^ - 


( 4 ) 


non-linear terms and the temperature forcing while 
jjn+i /2 viscous terms: The former are computed explicitly in time, 

the latter implicitly. The how incompressibility is then enforced by a pres¬ 
sure correction that takes the form of a Poisson equation = diUi whose 
solution is the most computationally demanding step, especially on non- 
uniform grids. In addition, the Poisson equation is non-local and this has 
consequences on code parallelization, requiring the largest amount of com¬ 
munication. Once the scalar 0 is obtained, the velocity is projected onto 
the solenoidal held and the new pressure can be computed. The 
advancement of the temperature is performed directly through 


gn+l _ QT 

At 


= _|_ yn+ll2 


( 5 ) 


where, again, contains the explicit nonlinear terms and the 

implicit dihusive terms. 

All the variables are discretized by central second-order hnite-diherences 
on a staggered grid and the time advancement of the solution is obtained 
by a low-storage third-order Runge-Kutta scheme. Further details can be 
found in Ref. [T2] . 
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3. Pressure effects and heat transfer in RB 

As mentioned in the introduction, the Prandtl number gives the ratio of 
momentum to thermal diffusivity, and even if Pr is of order unity, temper¬ 
ature and momentum do not have the same gradient magnitudes (cf. hgure 
[^. We quantify this statement in hgure a) by showing instantaneous tem¬ 
perature 6 and vertical velocity Uz prohles across a vertical line from a RB 
simulation: Much steeper gradients, almost ‘shock’ like, can be seen in the 
temperature held. These steep gradients are smoothed in the vertical veloc¬ 
ity, owing to the pressure ehects and this lowers the resolution requirements 
of momentum with respect to temperature. This observation is further cor¬ 
roborated by hgure showing the probability density functions of dzO and 
dzUz computed in the bulk of the how without the boundary layers. Ex¬ 
treme gradients can be seen to be more likely for 6 thus evidencing a more 
intermittent behaviour. This behaviour has been extensively studied, and it 
is a well established fact that the intermittency corrections to the structure 
function exponents are much larger for scalars than for velocity [I3l[ll]. In¬ 
deed, these shock-like fronts become much sharper with increasing Reynolds 
number, and thus increasing small-scale intermittency nansiiiTiiiHmniEn]. 




Figure 2: a) Instantaneous 9 and Uz profiles as a function of the vertical coordinate z/L 
for Ra = 10^°, Pr = f RB simulation, h) PDF of dzd and dzUz for the same simulation 
in the bulk. 

We have argued that the absence of the pressure term in the scalar equa¬ 
tion is the reason for the localized steep gradients and this idea can be con- 
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firmed by the Burgers equation: 

dtu + udxU = usdxxU, ( 6 ) 

which is often used to test numerical schemes since it is a simple one¬ 
dimensional partial differential equation that still retains the essential fea¬ 
tures of the more complex Navier-Stokes equations: The unsteady term, a 
quadratic non-linear term and a strictly dissipative viscous term IZH. 

A one-to-one comparison with equation (|^, however, evidences the ab¬ 
sence of a pressure term and this causes the solution u{x,t) to develop 
‘shock’-like discontinuities in a hnite time and for hnite values of the dif- 
fusivity ub- 

As an example, in hgure a solution obtained for an initial condition 
u{x, 0) = sinx in the domain —n < x < tt and ub = 5 x 10“® is shown. The 
gradient at x = 0 becomes so steep that a hne nonuniform mesh clustered at 
the centre of the domain is necessary to properly capture the solution (hgure 
l^a). For under-resolved meshes (hgure [^6) the solution unphysically over¬ 
shoots the initial extrema Umin = — 1 and Umax = 1 and spurious oscillations 
are generated in the region of the steep gradient (hgure |^c) resembling the 
behaviour of underresolved scalar helds. A very similar phenomenon is ob¬ 
served in a simulation with an underresolved temperature held as shown in 
hgure 1^ 

In RB how, one of the interesting quantities is the heat hux Q transferred 
from one plate to the other. In non-dimensional form this is the Nusselt 
number, Nu = Q/K,T{Th—Tc)L~^. This is not only interesting from a physical 
point of view, but also as a monitoring variable since it has been observed [22] 
that when the separately calculated Nusselt number converge and are grid 
independent, at least all quantities up to second order statistics are properly 
resolved. 

There are several ways to calculate Nu, either by measuring the convec¬ 
tive heat transport in the system 

Nu{z) = VRaPr{uz6)A,t + 1 , ( 7 ) 

or by using the exact relationships derived from global balances [23] of kinetic 
energy, 

= uU‘^fL-‘^{[diUj]‘^)v,t = u^L-^{Nu - l)RaPr-^, ( 8 ) 

and thermal energy, 

e, = Krin - T,yL-^{[d,e]^)v,t = KriU - T,fL-^Nu. (9) 
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Figure 3: Spatial profiles at various times of the solution of the Burgers equation for 
= 5 X 10“®: dashed t — 0, dotted t = 0.2, chaindot t = 0.4 and solid t = 0.8. a) 
resolved simulation with Ax = 10“"*, b) underresolved simulation with Ax = 4 x 10“^, c) 
enlargement of b) in the region of the steep gradient. 


Here the subscripts t, A and V denote, respectively, averages in time, hori¬ 
zontal homogeneous planes and the whole fluid volume, and Uf the free-fall 
velocity Uf = PgiT^ — Tc)L. Although equation ([^ depends on z, once 
the equilibrium is reached its value becomes constant. This condition is 
requested to assess the statistical convergence of the results. 

From here on, we denote Nu with a subscript, either Nuu^e, or 

Nueg which specifies the particular equation, i.e. 0-0 respectively, with 
which Nu is calculated. In addition, we also denote the Nusselt number 
calculated by the temperature gradient at the wall as Nuq^. All definitions 
are equivalent analytically, but they involve gradients or square gradients 
of the variables that, when calculated numerically, can deviate from each 
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Figure 4: a) Snapshot of an underresolved flow field for Ra = 10® and Pr = 1. The 
characteristic “wiggles” of the underresolved temperature field can be clearly appreciated 
(see arrows), b) Snapshot of a properly resolved velocity field. 


other if the simulations do not have enough spatial resolution to capture the 
smallest flow scales: Their comparison can thus be used as a test for the 
adequacy of the mesh izg. 

Figures]^ a) and[^&), show the ratios Nu^g/Nuu^e and Nu^^/Nuu^e for 
RB simulations performed on the same grid for momentum and temperature 
at Ra = 10® and Pr = 1 or Pr = 10. Resolutions between 96^ x 192 and 
384^ X 768 were used, the larger number corresponding to the vertical (wall- 
bounded) direction. An aspect ratio of F = 1 was used in both directions, 
meaning that the computational domain is cubic. Points were clustered near 
the wall using a Chebychev-like distribution according to the prescriptions 
of Ref. [in]. The Kolmogorov scale is computed from the kinetic energy 
dissipation rate using r]K/L = and the Batchelor scale t]b/L = 

r]K/LPr~^^'^. As mentioned above, the various expressions for Nu should 
be equivalent, their ratios however, approach the unity limit only when the 
normalized mesh sizes ^/rjK and ^/rjB decrease and they do not converge 
at the same rate. In particular it can be noted that Nu^^/Nuu^e tends to 
unity for larger grid spacings than Nu^g/Nuu^e even for Pr = 1 and this 
corroborates our hypothesis that a hner resolution is needed for the scalar 
than for momentum. Using an identical mesh to spatially discretize both 
momentum and the scalars therefore produces an overhead in computational 
resources that is redundant. 
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At the highest resolution, all the dehnitions converge to the same value 
(within an uncertainty of 2-3%), therefore we will refer to it as Nuref with¬ 
out specihng the particular expression and hgure [^c) and l^d) show the 
convergence of Nuu^e to this asymptotic value. 





Figure 5: (a) ratio between different ways of calculating Nu against grid size for a rect¬ 
angular RB simulation with Ra = 10® and Pr = 1. Red squares are Nue^/Nuu^s, blue 
circles are Nu^g/Nuu^e- {b) same as (a) for Pr = 10. Nu^g is plotted against max(A/? 7 B) 
while Nug^ is plotted against max(A/77B). For Pr = 1, 77 ^ = rjK- c) convergence to an 
asymptotic value of Nuu^e for Ra = 10® and Pr = 1. (d): same as {c) for Pr = 10. 


4. The multiple resolutions strategy 

%i. Multiple resolution strategy in space 

In this subsection we present a method to decouple the spatial discretiza¬ 
tion of the scalars and the momentum, which allows for large computational 
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savings. This is achieved by rehning every cell of a base mesh Af* times in 
each Tth direction. A simplihed two-dimensional sketch of this procedure is 
shown in hgure On the left, the location of the scalars and velocities in 
the standard single mesh is shown for a staggered arrangement. There, the 
velocity components are at the centres of the cell faces, while the pressure 
and the scalar are discretized at the centre of the cell volume. The right 
panel shows a case with velocity and pressure on the base grid, and a doubly 
rehned (Al^ = = 2) mesh for the scalar, which is temperature in the 

RB case. 

The method works as follows. We hrst generate the rehned mesh over 
which the scalar held is discretized and then a coarser mesh is obtained by 
taking only one out of Al* nodes in the Tth direction. Note that when the 
mesh is uniform in space this is equivalent to start from the coarse cells 
and split them into Al* identical parts. For a non-uniform mesh, however, 
this naive splitting would result in a staircase distribution of the metrics for 
the hne grid with constant coefficients within each coarse element and with 
jumps across its boundaries. These discontinuities would locally decrease the 
accuracy of the discretization to hrst order and introduce spurious oscilla¬ 
tions in the resolved helds. The diherence between the two methods is shown 
in hgure a) for a mesh obtained by a Chebychev-like distribution with 96 
base nodes and a rehnement factor of eight. In this chapter, most numer¬ 
ical examples are obtained using the same Al* in every direction therefore, 
unless otherwise specihed, from here on we will use only A1 to indicate the 
isotropic rehnement factor without specifying the direction. However, this is 
not required for the method to work and the same procedure can be applied 
to rehnement levels diherent in each direction depending on the particular 
how physics. An example will be shown in §5.2 

In order to advect the scalar, a velocity held has to be projected from 
the base mesh onto the rehned grid in a staggered arrangment with respect 
to the scalar. A hrst straightforward possibility consists of using a tri-cubic 
Hermite spline interpolation, with a stencil of four points in every direction, 
for a total of sixtyfour points in three dimensions which, according to our 
numerical tests, is the minimum required. At the top and bottom boundaries, 
one-sided interpolation is used, which in principle is less accurate, but it 
is performed on a much hner grid, therefore the amplitude of the error is 
much smaller than in the bulk. The accuracy of Hermitian interpolation 
has proven to be sufficient in our turbulent hows, and it is comparable to 
that of B-splines m- Hermitian interpolation, however, is preferred in this 
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Figure 6: Location of pressure, temperature and velocities of a 2D simulation cell. The 
third dimension (z) is omitted for clarity. As on an ordinary staggered scheme, the velocity 
vectors are placed on the borders of the cell and pressure is placed in the cell center. The 
temperature is placed on the same nodes as the vertical velocity, to ensure exact energy 
conservation. 

method as B-splines are much more computationally expensive. Preliminary 
simulations have shown that a linear interpolation using a two point stencil, 
though computationally cheaper, is not adequate since it results in a spatially 
interpolated velocity field which has equal gradients for all the refined points 
inside every base cell, and discontinuities at the base cell boundaries (figure 
[^6)). This lack of homogeneity results in spurious oscillations in the scalar 
field, in particular around local maxima and minima of velocity. An example 
of these oscillations is shown in figure with the checkboard pattern given 
by the footprint of the base mesh. 

The spatially interpolated velocity can then be used to advance equation 
(|^, and compute the values of the scalars at the new time. If a scalar couples 
back to the momentum field, like in the case of RB flow, a spatial filter must 
be applied to calculate a “coarsened” scalar. In this case, an averaging using 
equal weights within each refined cell is used. This averaged scalar is then 
used in equation ([^ to advance momentum and pressure. Notice that in 
this case the scalar field is interpolated from a fine mesh onto a coarser one, 
therefore the previous problem of having equal fluxes between neighbouring 
cells is not encountered. 

We stress that the interpolation of a velocity field between two different 
grids is a very dangerous operation since its effect is equivalent to that of a 
low-pass filter, which usually leads to loss of energy and generation or de¬ 
struction of information. This is catastrophic in the DNS of turbulent flows 
where the dynamics are based on the energy cascade through the scales. It 
has even more serious consequences in a RB flow where the balance between 
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Figure 7: (o) Normalized metric Ax/N^, where is the amount of grid points for a 
Chebychev-type grid clustering for a refined-mesh generated {M = 8) from a base mesh 
by splitting the base cells (squares), and the base mesh generated from the refined mesh 
(diamonds). The first method causes a staircase-like metric, which leads to spurious oscil¬ 
lations. (&) Comparison on interpolated velocity v from a base mesh to the refined mesh 
{M = 8) using linear interpolation (squares) and cubic Hermite splines (diamonds). Both 
interpolations coincide at the base mesh points. The underlying basis for the interpolation 
is also plotted. 
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Figure 8: (a) Pseudocolour plot of temperature at the mid-height when using a refinement 
of = 3 and linear interpolation for the velocities. Spurious oscillations every three 
points can be seen all over the domain. (&) Zoom-in of the region inside the black square 
of (a). 
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thermal (potential) and kinetic energy determines the heat transfer. Never¬ 
theless, if the base mesh already captures the smallest momentum structures, 
the held is smooth already at the scale of the base cell and an interpolation 
kernel that is continuous enough does not alter the energy content of the 
held. This has been directly verihed with the DNS data by computing the 
relation (|^ and the various meshes and comparing the results. 

As mentioned above, interpolating the velocity held from the base to 
the rehned mesh is the most immediate although it must be noted that it 
results in a non solenoidal held. We wish to stress that this is not due to the 
tricubic Hermitian interpolation since, owing to the staggered arrangement 
of the velocity components, even a linearly interpolated velocity would be 
non solenoidal on the hne grid. 

Indeed one may also just use the tricubic interpolation “as-is”. Our sim¬ 
ulations in this chapter indicate that the global responses and turbulent 
statistics show great agreement with those by using a single rehned grid, and 
this non-solenoidal held has not resulted in apparent problems. The residual 
divergence is in fact very small (~ 0(10“^)) for most part of the how domain 
as long as the base mesh is in the DNS range for the momentum. Higher 
residual divergence may be observed in some very localised regions which 
are usually close to the wall boundary, such as the places where the scaler 
plumes develop. Notice that at every time step the rehned velocity held 
is interpolated from a solenoidal velocity held on the coarse grid, and the 
tricubic interpolation preserves the property of global divergence-free in our 
problems. Thus the error induced by the residual divergence should be small 
and does not accumulate during the time advance. Nevertheless, the ehect 
may get worse as the grid becomes more non-uniform which could become a 
problem at higher drivings. 

A drastic way of addressing this issue could be a “minimum energy” 
correction to fully remove the residual divergence of the interpolated velocity; 
this would require the solution of a Poisson equation for a velocity correction, 
with the local divergence as source term. While the multiple resolutions 
strategy would remain favourable, solving a Poisson equation on the hne 
grid would negate one of the main advantages of the method. 

Alternatively, we propose here a locally mass-conserving interpolation 
which produces a divergence-free velocity held without solving the Poisson 
equation on the hne grid. Notice that the velocity held on the coarse grid is 
solenoidal in the sense of dx:U + dyV -1- d^w = 0 at the cell center. Thus we hrst 
compute {dxU, dyV, d^w) at the cell center on the coarse grid, then interpolate 
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them to the cell center on the hne grid by using the tricubic interpolation. 
Since the three quantities are dehned at the same locations on both the 
coarse and hne grids, the interpolation involves the same stencil points and 
we must have d^u + dyV + dzW = 0 at the cell center on the hne grid. With 
the interpolated dyV, d^w) the velocity held can be reconstructed. For 
example, one can conduct a two-dimensional tricubic interpolation for u on 
an arbitrary y-z plane, then integrate along the a:-direction according the 
interpolated dxU. v can be treated similarly. For wall normal velocity w, one 
can just integrate dzW from the two plates where w = 0. 

Of course some price has to be paid by using this new interpolation. It 
requires more CPU time than the tricubic interpolation due to extra inte¬ 
gration and data communication between processes. Another issue is the 
accumulated error of the integration. For instance, the integral of the inter¬ 
polated dzW from one plate to the opposite one may not equal to zero, which 
will violate the impermeability condition at wall boundary. In practice, one 
can reduce the error by constructing w from both plates and taking the av¬ 
erage. In our simulation we found out that the error of w at two plates is of 
order 10“®, and the total hux caused by this error is basically machine zero 
at both plates. Furthermore, the difference between the total kinetic energy 
of the coarse grid and hne grid is usually about 0.1%. 

The previous discussion suggests that the proposed multiple resolution 
procedure can only work if the coarse mesh is hne enough to fully resolve 
the momentum held. In ^ numerical results will conhrm this statement 
showing that when the base mesh adequately resolves the momentum held 
good results and CPU time saving can be obtained by rehning only the grid 
for the scalar. On the other hand, if the coarse grid does not fully resolve the 
momentum held even very large values of Ai do not lead to correct results. 

4-2. Multiple resolution strategy in time 

The multiple resolution in space entails that the node spacing for the 
scalars (Ag) is smaller than that for the momentum (A[/) and this has 
immediate consequences on the stability of the time integration because 
of the explicit terms. Due to the Courant-Friedichs-Lewy condition I2SI. 
At ■ max[f//A 5 ] < Ccfl, in fact, the time step must decrease by a factor 
equal to the rehnement Ai because min[A 5 ] = min[A[/]/Ad. As this small 
At is not needed by the base mesh for momentum and pressure, the usage 
of the current approach becomes disadvantageous very rapidly, especially in 
high Sc hows requiring high values of At. However, a multiple resolution 
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strategy can be applied also in time, by advancing the more expensive mo- 
mentnm and pressnre with a larger time-step and the scalar with a smaller 
one, nsing a temporally interpolated velocity. In this way, the stability of the 
explicit terms in the scalar eqnation is retained withont the penalization of 
an nnnecessary small integration step for all the other eqnations. 

The integration of the scalar eqnation is therefore performed in L snb- 
steps, and at each intermediate I time level the velocity is linearly interpo¬ 
lated through 


q' = ( 10 ) 

where g"' is the spatially interpolated velocity at time step n. A simple linear 
interpolation is used, which is enough to ensure correctness as is demon¬ 
strated in ^ This velocity is then used to advect the scalar(s) for every 
subtimestep using equation (|^ until the scalars have been advanced to the 
time Then, the velocity can be advanced a further time-step and the 

procedure repeated. If the maximum possible CFL is used for the velocity 
equations, which is usually the case, then C> M. must be satished to ensure 
stability. 

5. Results and computational performance 

In this section we will present the results of the multiple resolution method, 
and the associated saving in computational time and memory usage. A more 
extensive analysis will be performed in the hrst part for RB flow while, in the 
second part, additional results with anisotropic rehnements will be shown for 
DDC flow. 

5.1. Rayleigh-Benard flow 

A series of rectangular RB simulations with aspect ratio T = 1 were run 
to validate and to demonstrate the benefits of the described method. Meshes 
of 96^ X 192, 128^ x 256 and 192^ x 384 (only for Pr = 1) were used for 
momentum, the Ra was kept constant at 10® while Pr was taken Pr = 1 or 
Pr = 10. In order to minimize the computational costs C = Ai was always 
used except for a specihc set of runs in which the effects of C were isolated. 

Figure |^a,&) shows the ratio between the different dehnitions of Nu 
and the maximum rehnement level Ai while the bottom panels report the 
convergence of Nu^^e to the asymptotic reference value, calculated from a 
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wiggle-free simulation such as the one seen in Figure [^6). The raw numerical 
values can be found in table A.2 at the end of the chapter. 





Figure 9: (a) Ratio between different ways of calculating Nu against grid size for the two 
base meshes and increasing spatial refinement for Pr = 1. (6) same as (a) for Pr = 10. 

(c) convergence to an asymptotic value of Nuu^e for increasing refinement and Pr = 1. 

(d) : same as (c) for Pr = 10. Circles are for base meshes of 96^ x 192, while triangles are 
for 192^ X 384 base meshes. On the top panels, blue indicates the ratio Nu^g/Nuu^e and 
red the ratio Nue^/Nuu^e- 

As mentioned before, the multiple resolution strategy only works if the 
base mesh is sufficiently hne to fully resolve the momentum held. Looking at 
hgurej^a) we can see that at Pr = 1 the 96^ x 192 mesh is not sufficiently 
hne, and the diherent Nu dehnitions do not show a monotonic convergence of 
the Nusselt ratios to the asymptote. It is interesting to note that for Ad = 4 
the Nu ratios get close to one, although the absolute values of Nu are wrong 
and do not indicate a convergence towards the reference grid independent 
value. On the other hand, the 192^ x 384 grid yields a converged value of 
Nug^^ even though that resolution is not enough for the computation of Nu^g 
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that involves squared temperature gradients. For this case a rehnement factor 
M. = C = 2 for the temperature gives an appropriate resolution as shown 
by the Nusselt numbers that converge to the reference value. Although a 
factor two in space and time might seem to produce only limited benehts, 
we should consider that it implies a grid for the momentum and pressure 
with 2^ less elements than for the temperature. In addition, the momentum 
equations are solved only once every other time sub-step therefore, even if 
there is an overhead introduced by the interpolation of the helds, the CPU 
time savings are substantial. In our case the standard simulation on the 
single grid 384^ x 768 required for the integration of dimensionless time unit 
a wall-time of 36.4 minutes on 96 processors, for a total of 58.2 CPU hours 
using 13 GB of RAM memory. Using £ = A4 = 2, one dimensionless time 
unit required a wall-time of 36.7 minutes on 48 processors for a total of 29.4 
CPU hours using 4.9 GB of RAM memory. 

The method becomes even more advantageous as the Prandtl number 
increases. In fact, from equations (|^-([^ we see that the nondimensional 
diffusivities of momentum and temperature, read Re = \/Ra/Pr and Pe = 
y/PaPr, respectively. Therefore for Pr > 1 the momentum held smoothens 
while the temperature held develops sharper gradients. This results in larger 
Kolmogorov rjK and smaller Batchelor rjB scales that need diherent meshes 
to be properly resolved. Figure shows that at Pr = 10 the increased 
momentum dihusivity makes even the relatively coarse grid 128^ x 256 ad¬ 
equate for the description of the momentum held. On the other hand, the 
same mesh is clearly too coarse for the scalar held as the ratio Nuu^e/Nu^^ 
strongly deviates from one. This grid, however, can be used to advance the 
momentum and to generate a rehned mesh to advect the temperature. For 
this case the convergence of the Nusselt numbers to the reference value is 
obtained for Ad = £ = 3 that yields a computational gain around a factor 
7 and a reduction of RAM memory by a factor 3.5 when compared to the 
reference cases using a single mesh. 

Before concluding this section we point out that for all simulations we 
have used a rehnement factor for the time step C = M.. Values of C smaller 
than M. can be used provided the CFL number for momentum is reduced so 
that the scalar integration remains stable; this increases the number of time 
steps needed to advance the simulation over the same physical time resulting 
in an increased computational cost. On the other hand, further increasing C 
beyond Ad does not modify the results within statistical error and empirical 
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evidence supporting this statement can be found in table A.3 at the end of 
the chapter. 


5.2. Double diffusive convection 

Double-diffusive convection (DDC) is a system where two scalars with 
very different diffusivities are coupled to the flow held, one of which is stabi¬ 
lizing and the other destabilizing. A relevant example of this system is the 
ocean, where the scalars are temperature {Pr ^ 7) and salinity {Sc ~ 700). 
The former being warmer at the top surface and cooler at the bottom stabi¬ 
lizes the how while the latter has the opposite ehect because a higher salinity 
at the top boundary results in a denser, sinking huid. A snapshot of the how 
in a DDC system for a geometry similar to RB can be seen in hgure[^ Very 
sharp gradients of salinity can be observed, while the temperature held is in 
a quasi-dihusive state. 



Figure 10: Instantaneous snapshot of (a) salinity {Scg = 700), (b) temperature {Pr = 7) 
and (c) vertical velocity in a DDC simulation at drivings of Rug = 5 x 10^ and Rut = 
5 X 10^. This results in Le = 100 and Rp = \. 

In this case, the computational gains associated with a multiple resolution 
can be even larger than before, owing to the very large Sc number of salt in 
water. In the simulations of Ref. [2B], the temperature is discretized without 
further rehnement, for salinity a factor Ad = 3 is used for all three directions. 
In this simulation we did not use the maximum possible CFL for the velocity 
equations, and £ = 1 for the time step. 

Similar to the RB case, the results in terms of heat transfer, salinity 
huxes and turbulence statistics agree within the uncertainty of 2-3% with 
those obtained using a single rehned grid for all the variables. Nevertheless, 
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using the present numerical approach momentum and temperature equations 
are solved on a mesh with 3^ less nodes than that for salinity, i.e., the coarse 
grid of 144 x 144 x 144 and the rehned grid of 432 x 432 x 432. By using 
two 12-core 2.6 GHz Intel Xeon E5-2690v3 (Haswell) CPUs with 12 MPI 
process and 2 OPENMP threads, the CPU time for one time step is about 
12.9 seconds for using single rehned grid, 2.69 seconds for multi grids with 
tricubic interpolation, and 2.98 seconds for multi grids with our mass con¬ 
serving interpolation, yielding a ssaving factor that exceeds 4. A decrease of 
RAM memory usage by more than 50% is also achieved by using the multi 
resolution strategy. 

Before concluding this section it is worthwhile to discuss briehy why in 
DDC it is possible to simulate a how with a Schmidt number of 700 without 
using a rehnement factor ~ 27. The DDC equations in nondimensional 
form read 


/ Prq 

dtUi - 1 - UjdjUi = -dip djjUi {Rp9 - S)Siz, ( 11 ) 

dte + UjdjO = dj.O, ( 12 ) 

where the how parameters are dehned as Rqt = gf^r^TL/{ vkt), Pr^ = 
i'/ht, Ro-s = gl^s^SL/iyKs) and Prs = Le = Pts/Rtt and Rp = 

RaxLe/Ras- 

It can be noted that the dihusivity of momentum Re = Prs/Ras 
increases with Prs- Therefore for large enough values the how does not 
fully transition to turbulence. On the other hand, the equations for salinity 
and temperature are linear and they can not sustain the cascade without a 
turbulent velocity held. This is indeed the case for the how parameters of the 
present numerical example (see also Ref. |26] ) where the fully turbulent three 
dimensional cascade cannot be achieved and a factor Ai = ~ 27 is not 

required for salinity. Even so, the multiple resolutions strategy results in a 
substantial gain factor in computational time and RAM memory occupation 
when compared against the single grid strategy. 
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6. Summary and conclusions 

In this paper we have presented a numerical strategy for the direct nu¬ 
merical simulation of turbulent flows with active and/or passive scalar fields 
without over-resolving the momentum equation and its pressure correction. 
This is certainly the case of flows with scalar diffusivity smaller than the kine¬ 
matic viscosity {Pr or So 1). Substantial computational time and memory 
occupation savings are even obtained for equally diffusive fields with Schmidt 
numbers of order unity. The different requirements for spatial discretization 
of scalars with respect to momentum originate not only from the diffusivity 
but also from the pressure. Its non-local effect was found to smoothen the 
momentum gradients and thereby reduce the resolution requirements with 
respect to resolving the scalar field. This scenario modifies the picture ob¬ 
tained from dimensional analysis that compares only the Kolmogorov and 
the Batchelor scales. 

To reduce computational costs, a multiple resolutions strategy was de¬ 
veloped in which momentum is discretized on a base mesh while scalars are 
discretized on a refined mesh. To solve the scalar diffusion-advection equa¬ 
tion, momentum is spatially interpolated onto the refined grid and either 
tricubic Hermitian splines or a more sophisticated procedure, based on the 
interpolation of velocity divergence, are proposed. The scalar is advanced in 
time, and if necessary, coarsened to couple it back to the momentum equa¬ 
tions. Due to stability constraints on the non-linear terms, the scalar is 
advanced in time using a refined timestep. Veiocity is lineariy interpoiated 
in time for aii the intermediate timesteps. The optimal amount of substeps 
C coincides with the grid refinement factor Ai, when it is isotropic, or with 
max[A^*] when it is anisotropic. 

The method was applied to Rayleigh-Benard convection, and decoupling 
the grid resolutions was found to result in computational speedups around 
two for Prandtl number unity, and seven for Pr = 10. This strategy was also 
applied to high Sc flows, also resulting in computational advantages of ap¬ 
proximately four. Due to the large costs, both in operations, memory usage 
and in communication associated to solving the Poisson equation, we expect 
the gains to increase for larger grids and larger drivings. This is because 
the Poisson solver is the most expensive part of advancing the Navier-Stokes 
equations in time, and this does not scale linearly with the amount of points, 
while the scalar diffusion-advection equations do. We expect gains of about 
three to four times for RB simulations at Pr = 1 and Ra = 10^^ with pro- 
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duction grids of about 1 billion points and A4 = 2, planned for the future. 
Also- the memory consumption is heavily reduced, by a factor 2.6x with a 
rehnement of two, and this makes some simulations possible on supercom¬ 
puters with a lower memory per core and decreases the dependence on high 
CPU-memory bandwidth. 

Once again, it is crucial that the base mesh is hne enough to correctly re¬ 
solve the momentum held. Adding more rehnement to the scalar mesh when 
the velocity grid is insufficient does not give an improvement of the quality 
of the results, and it might even lead to the suppression of small scales that 
violate energy conservation. This method could in principle be additionally 
applied to hows with Pr < 1. Obviously, in this case the velocity held should 
be solved on a mesh hner than that of the scalar. Although explicit numerical 
tests have not been attempted, we expect that the computational overhead 
introduced by the interpolation and coarsening of the helds overcomes the 
advantages produced by solving the scalar equation on a coarser mesh. 

Acknowledgments: We acknowledge FOM, an ERG Advanced Grant and 
computing time from SurfSARA (granted through NWO) and PRAGE grant 
2013091966. 

Appendix A. Numerical details 


N^X NyX W 

Pr 

Nuu^e 




96 X 96 X 192 

1 

66.8 

67.2 

71.0 

59.7 

128 X 128 X 256 

1 

64.9 

64.6 

67.1 

60.5 

192 X 192 X 384 

1 

63.2 

63.2 

64.1 

61.3 

256 X 256 X 512 

1 

63.8 

63.6 

64.6 

62.5 

384 X 384 X 768 

1 

64.0 

63.6 

64.3 

63.2 

96 X 96 X 192 

10 

68.4 

67.9 

68.6 

60.8 

128 X 128 X 256 

10 

65.5 

65.2 

65.7 

61.0 

192 X 192 X 384 

10 

63.2 

63.0 

63.1 

61.0 

256 X 256 X 512 

10 

63.6 

63.9 

63.6 

62.4 

384 X 384 X 768 

10 

63.4 

63.9 

63.4 

63.3 


Table A.l: Details of grid resolution used for standard single grid runs. Simulations were 
run until Nuu^e achieved 1% temporal convergence. All the simulations are performed at 
Ra = 10® and T = 1. The first column shows resolution, the second Pr, while the other 
four show the results of the different definitions of Nu. 
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N^x NyX 

M 

Pr 

Nuu^e 

Nue^ 


NUeg 

96 X 96 X 192 

2 

1 

65.5 

65.4 

68.4 

63.0 

96 X 96 X 192 

3 

1 

65.6 

65.3 

69.3 

69.5 

96 X 96 X 192 

4 

1 

65.4 

65.4 

66.0 

64.9 

128 X 128 X 256 

2 

1 

64.4 

64.4 

66.7 

66.7 

128 X 128 X 256 

3 

1 

64.4 

64.4 

66.9 

66.9 

192 X 192 X 384 

2 

1 

63.5 

63.4 

62.6 

64.5 

96 X 96 X 192 

2 

10 

66.6 

64.8 

60.0 

66.0 

96 X 96 X 192 

3 

10 

64.5 

64.5 

63.5 

63.5 

96 X 96 X 192 

4 

10 

65.4 

65.0 

64.2 

64.2 

128 X 128 X 256 

2 

10 

64.3 

64.2 

64.7 

63.0 

128 X 128 X 256 

3 

10 

64.6 

64.1 

62.2 

64.8 


Table A.2: Details of grid resolution used for multiple resolutions runs. Simulations were 
run until Nuu^g achieved 1% temporal convergence. All the simulations are performed at 
Ra = 10® and F = 1. The first column shows resolution, the second shows the refinement 
of the scalar grid in all directions, the third Pr, while the other four show the results for 
the different definitions of Nu. For all simulations L = M.. 


C 

CcFL 

Nuu,e 




1 

0.6 

64.5 

64.6 

67.3 

64.0 

2 

1.2 

64.4 

64.4 

66.8 

63.5 

3 

1.2 

64.6 

64.3 

67.2 

63.5 


Table A.3: Details of the testing for the temporal multiple resolutions. Simulations were 
run until Nuu^g achieved 1% temporal convergence. All the simulations are performed at 
Ra = 10® and T = 1 on a grid 128 x 128 x 256 with A4 = 2. The first column shows the 
time refinement level C, the second the maximum CFL computed on the momentum grid, 
while the last four show the results of the different definitions of Nu. 
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